Cross-language information retrieval in Twenty-One: Using one, some or all possible translations?

نویسندگان

  • Djoerd Hiemstra
  • Franciska de Jong
چکیده

This paper gives an overview of the tools and methods for Cross-Language Information Retrieval (CLIR) that were developed within the Twenty-One project. The tools and methods are evaluated with the TREC CLIR task document collection using Dutch queries on the English document base. The main issue addressed here is an evaluation of two approaches to disambiguation. The underlying question is whether a lot of effort should be put in finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results? The experimental study suggests that in terms of average precision, searching with ambiguities leads to better retrieval performance than searching with disambiguated queries.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Twenty One at TREC Ad hoc and Cross language track

This paper describes the o cial runs of the Twenty One group for TREC The Twenty One group participated in the ad hoc and the cross language track and made the following accom plishments We developed a new weighting algorithm which outperforms the popular Cornell version of BM on the ad hoc collection For the CLIR task we developed a fuzzy matching algorithm to recover from missing translations...

متن کامل

Twenty-One at TREC7: Ad-hoc and Cross-Language Track

This paper describes the o cial runs of the Twenty-One group for TREC-7. The Twenty-One group participated in the ad-hoc and the cross-language track and made the following accomplishments: We developed a new weighting algorithm, which outperforms the popular Cornell version of BM25 on the ad-hoc collection. For the CLIR task we developed a fuzzy matching algorithm to recover from missing trans...

متن کامل

Bilingual Dictionary Approach for Malay-English Cross-Language Information Retrieval

Cross-language information retrieval (CLIR) is the process of providing queries in one language and returning documents relevant to that query which is written in a different language. A popular approach to CLIR is to translate the query into the language of the documents being retrieved. One of the simplest and most effective methods for query translation is to perform dictionary look up based...

متن کامل

Resolving Translation Ambiguity using Monolingual Corpora. A Report on Clairvoyance CLEF-2002 Experiments

Choosing the correct target words is a difficult problem for machine translation. In cross-language information retrieval, this problem of choice is mitigated since more than one translation alternative can be retained in the target query. Between choosing just one word as a translation and keeping all the possible translations for each source word, one can apply a range of filtering techniques...

متن کامل

Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-language Information Retrieval

One of the aims of EuroWordNet (EWN) was to provide a resource for Cross-Language Information Retrieval (CLIR). In this paper we present experiments which test the usefulness of EWN for this purpose via a formal evaluation using the Spanish queries from the TREC6 CLIR test set. All CLIR systems using bilingual dictionaries must find a way of dealing with multiple translations and we employ a WS...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999